AITopics | learning agent

How and Why to Manipulate Your Own Agent: On the Incentives of Users of Learning Agents

Neural Information Processing Systems | Dec-25-2025, 01:21:37 GMT

The usage of automated learning agents is becoming increasingly prevalent in many online economic applications such as online auctions and automated trading. Motivated by such applications, this paper is dedicated to fundamental modeling and analysis of the strategic situations that the users of automated learning agents are facing. We consider strategic settings where several users engage in a repeated online interaction, assisted by regret-minimizing learning agents that repeatedly play a game on their behalf. We propose to view the outcomes of the agents' dynamics as inducing a meta-game between the users. Our main focus is on whether users can benefit in this meta-game from manipulating their own agents by misreporting their parameters to them. We define a general framework to model and analyze these strategic interactions between users of learning agents for general games and analyze the equilibria induced between the users in three classes of games. We show that, generally, users have incentives to misreport their parameters to their own agents, and that such strategic user behavior can lead to very different outcomes than those anticipated by standard analysis.

agent, name change, own agent, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.63)


Learning to Incentivize Other Learning Agents

Neural Information Processing Systems | Dec-24-2025, 10:51:57 GMT

The challenge of developing powerful and general Reinforcement Learning (RL) agents has received increasing attention in recent years. Much of this effort has focused on the single-agent setting, in which an agent maximizes a predefined extrinsic reward function. However, a long-term question inevitably arises: how will such independent agents cooperate when they are continually learning and acting in a shared multi-agent environment? Observing that humans often provide incentives to influence others' behavior, we propose to equip each RL agent in a multi-agent environment with the ability to give rewards directly to other agents, using a learned incentive function. Each agent learns its own incentive function by explicitly accounting for its impact on the learning of recipients and, through them, the impact on its own extrinsic objective. We demonstrate in experiments that such agents significantly outperform standard RL and opponent-shaping agents in challenging general-sum Markov games, often by finding a near-optimal division of labor. Our work points toward more opportunities and challenges along the path to ensure the common good in a multi-agent future.

learning, learning agent, name change, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.83)


Peer Prediction for Learning Agents

Neural Information Processing Systems | Dec-24-2025, 10:17:11 GMT

Peer prediction refers to a collection of mechanisms for eliciting information from human agents when direct verification of the obtained information is unavailable. These mechanisms are designed to have a game-theoretic equilibrium where everyone reveals their private information truthfully. This result holds under the assumption that agents are Bayesian and that each adopts a fixed strategy across all tasks. Human agents, however, are observed in many domains to exhibit learning behavior in sequential settings. In this paper, we explore the dynamics of sequential peer prediction mechanisms when participants are learning agents.

agent, algorithm, peer prediction, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.63)


Transfer in Reinforcement Learning via Regret Bounds for Learning Agents

Tuynman, Adrienne, Ortner, Ronald

arXiv.org Artificial Intelligence | Nov-14-2025

We present an approach for the quantification of the usefulness of transfer in reinforcement learning via regret bounds for a multi-agent setting. Considering a number of $\aleph$ agents operating in the same Markov decision process, however possibly with different reward functions, we consider the regret each agent suffers with respect to an optimal policy maximizing her average reward. We show that when the agents share their observations the total regret of all agents is smaller by a factor of $\sqrt{\aleph}$ compared to the case when each agent has to rely on the information collected by herself. This result demonstrates how considering the regret in multi-agent settings can provide theoretical bounds on the benefit of sharing observations in transfer learning.
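As a rough reading of the stated factor (a sketch, not the paper's theorem: constants and logarithmic terms are ignored, and the symbols $R_{\text{alone}}$ and $R_{\text{shared}}$ are introduced here purely for illustration), write $R_{\text{alone}}$ for the total regret of all $\aleph$ agents when each learns only from her own observations, and $R_{\text{shared}}$ for the total regret when observations are shared:

$$R_{\text{shared}} \approx \frac{R_{\text{alone}}}{\sqrt{\aleph}}, \qquad \text{e.g. } \aleph = 16 \;\Rightarrow\; R_{\text{shared}} \approx \frac{R_{\text{alone}}}{4}.$$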

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2202.01182

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)


Contracting with a Learning Agent

Neural Information Processing Systems | May-27-2025, 08:23:35 GMT

Real-life contractual relations typically involve repeated interactions between the principal and agent, where, despite theoretical appeal, players rarely use complex dynamic strategies and instead manage uncertainty through learning algorithms. In this paper, we initiate the study of repeated contracts with learning agents, focusing on those achieving no-regret outcomes. For the canonical setting where the agent's actions result in success or failure, we present a simple, optimal solution for the principal: Initially provide a linear contract with scalar $\alpha > 0$, then switch to a zero-scalar contract. This shift causes the agent to "free-fall" through their action space, yielding non-zero rewards for the principal at zero cost. Interestingly, despite the apparent exploitation, there are instances where our dynamic contract can make both players better off compared to the best static contract. We then broaden the scope of our results to general linearly-scaled contracts, and, finally, to the best of our knowledge, we provide the first analysis of optimization against learning agents with uncertainty about the time horizon.
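The "free-fall" dynamic described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three hypothetical actions in `ACTIONS`, the success reward normalized to 1, the contract scalar 0.6, the switch round, and the use of a follow-the-leader rule as a simple stand-in for a mean-based learning agent working with expected utilities.

```python
# Hypothetical actions as (success probability, cost to the agent).
ACTIONS = [(0.0, 0.0), (0.4, 0.1), (0.8, 0.3)]


def simulate_free_fall(actions, alpha=0.6, switch_round=100, total_rounds=300):
    """Linear contract with scalar `alpha`, then a zero-scalar contract.

    The agent is a follow-the-leader learner (an illustrative stand-in for a
    mean-based learner): each round it plays the action with the highest
    cumulative expected utility so far. The reward on success is 1.
    """
    cumulative = [0.0] * len(actions)  # agent's cumulative utility per action
    chosen = []                        # index of the action played each round
    principal_total = 0.0              # principal's cumulative expected utility
    for t in range(total_rounds):
        scalar = alpha if t < switch_round else 0.0
        # Ties are broken in favor of the lowest action index.
        leader = max(range(len(actions)), key=lambda i: cumulative[i])
        chosen.append(leader)
        success_prob, _ = actions[leader]
        principal_total += (1.0 - scalar) * success_prob
        # Counterfactual update: what each action would have earned this round.
        for i, (p, c) in enumerate(actions):
            cumulative[i] += scalar * p - c
    return chosen, principal_total


chosen, principal_total = simulate_free_fall(ACTIONS)
```

In this toy instance the agent keeps playing the costliest action for a while after the switch, then moves to the cheaper one, and finally to the zero-cost action, while the principal pays nothing during that second phase.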

artificial intelligence, contract, machine learning, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.43)


Randomized Truthful Auctions with Learning Agents

Neural Information Processing Systems | May-26-2025, 22:55:54 GMT

We study a setting where agents use no-regret learning algorithms to participate in repeated auctions. Recently, Kolumbus and Nisan [2022a] showed, rather surprisingly, that when bidders participate in second-price auctions using no-regret bidding algorithms, no matter how large the number of interactions T is, the runner-up bidder may not converge to bidding truthfully. Our first result shows that this holds for all deterministic truthful auctions. We also show that the ratio of the learning rates of different bidders can qualitatively affect the convergence of the bidders. Next, we consider the problem of revenue maximization in this environment. In the setting with fully rational bidders, the seminal result of Myerson [1981] showed that revenue can be maximized by using a second-price auction with reserves.
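To make the setting concrete, here is a minimal sketch of two bidders running a no-regret algorithm in a repeated second-price auction. It is an illustration under stated assumptions, not the paper's construction: Hedge (multiplicative weights) with full-information feedback is chosen here as one standard no-regret algorithm, and the values, bid grid, learning rate `eta`, and function names are all hypothetical.

```python
import math
import random


def second_price_utility(value, bid, other_bid):
    """Expected utility of bidding `bid` against `other_bid` with private `value`."""
    if bid > other_bid:
        return value - other_bid          # win and pay the other bid
    if bid == other_bid:
        return 0.5 * (value - other_bid)  # tie broken uniformly at random
    return 0.0                            # lose and pay nothing


def softmax(log_weights):
    """Turn log-weights into a probability distribution (numerically stable)."""
    top = max(log_weights)
    exps = [math.exp(w - top) for w in log_weights]
    total = sum(exps)
    return [e / total for e in exps]


def run_hedge_auction(values=(1.0, 0.5), grid_size=11, rounds=2000, eta=0.1, seed=0):
    """Two Hedge bidders (exactly two are assumed) in a repeated second-price auction."""
    rng = random.Random(seed)
    bids = [i / (grid_size - 1) for i in range(grid_size)]  # discrete bid grid on [0, 1]
    log_w = [[0.0] * grid_size for _ in values]             # uniform initial weights
    for _ in range(rounds):
        dists = [softmax(lw) for lw in log_w]
        played = [rng.choices(bids, weights=d)[0] for d in dists]
        for i, value in enumerate(values):
            other = played[1 - i]
            # Full-information update: score every grid bid against the
            # opponent's realized bid this round.
            for k, b in enumerate(bids):
                log_w[i][k] += eta * second_price_utility(value, b, other)
    return bids, [softmax(lw) for lw in log_w]
```

Calling `run_hedge_auction()` returns the bid grid and each bidder's final mixed strategy; comparing the lower-value bidder's distribution with a point mass on her value is one way to probe how close she comes to truthful bidding, and varying `eta` per bidder would be a natural extension for exploring the learning-rate ratio mentioned above.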

artificial intelligence, auction, machine learning, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.97)


Review for NeurIPS paper: Learning to Incentivize Other Learning Agents

Neural Information Processing Systems | Jan-27-2025, 14:17:28 GMT

Weaknesses: I have two concerns, on (1) baselines and (2) scalability. IA is a good baseline and it is nice to see that LIO outperforms IA, but I do think the results would be more convincing if more benchmark algorithms were included. Mutual information can also be viewed as an approximation of accounting for other agents' future policy change and has shown strong performance in harvest/cleanup with a large number of agents. Can we simply learn a value function conditioned on the received rewards of different agents (in the same spirit as DDPG) so that we can avoid computing second-order gradients? These are the questions raised when I read the paper, and I believe more in-depth discussion and experiments would further consolidate the contribution of this work.

experiment, learning agent, neurips paper, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.64)


How and Why to Manipulate Your Own Agent: On the Incentives of Users of Learning Agents

Neural Information Processing Systems | Jan-18-2025, 15:26:23 GMT

The usage of automated learning agents is becoming increasingly prevalent in many online economic applications such as online auctions and automated trading. Motivated by such applications, this paper is dedicated to fundamental modeling and analysis of the strategic situations that the users of automated learning agents are facing. We consider strategic settings where several users engage in a repeated online interaction, assisted by regret-minimizing learning agents that repeatedly play a "game" on their behalf. We propose to view the outcomes of the agents' dynamics as inducing a "meta-game" between the users. Our main focus is on whether users can benefit in this meta-game from "manipulating" their own agents by misreporting their parameters to them.

agent, learning agent, own agent, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)


Peer Prediction for Learning Agents

Neural Information Processing Systems | Oct-11-2024, 12:43:10 GMT

Peer prediction refers to a collection of mechanisms for eliciting information from human agents when direct verification of the obtained information is unavailable. These mechanisms are designed to have a game-theoretic equilibrium where everyone reveals their private information truthfully. This result holds under the assumption that agents are Bayesian and that each adopts a fixed strategy across all tasks. Human agents, however, are observed in many domains to exhibit learning behavior in sequential settings. In this paper, we explore the dynamics of sequential peer prediction mechanisms when participants are learning agents.

agent, algorithm, peer prediction, (5 more...)

Neural Information Processing Systems

Genre: Play > Prospect (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)


Learning to Incentivize Other Learning Agents

Neural Information Processing Systems | Oct-11-2024, 02:37:37 GMT

The challenge of developing powerful and general Reinforcement Learning (RL) agents has received increasing attention in recent years. Much of this effort has focused on the single-agent setting, in which an agent maximizes a predefined extrinsic reward function. However, a long-term question inevitably arises: how will such independent agents cooperate when they are continually learning and acting in a shared multi-agent environment? Observing that humans often provide incentives to influence others' behavior, we propose to equip each RL agent in a multi-agent environment with the ability to give rewards directly to other agents, using a learned incentive function. Each agent learns its own incentive function by explicitly accounting for its impact on the learning of recipients and, through them, the impact on its own extrinsic objective. We demonstrate in experiments that such agents significantly outperform standard RL and opponent-shaping agents in challenging general-sum Markov games, often by finding a near-optimal division of labor.

incentive function, learning agent, multi-agent environment

Neural Information Processing Systems

Technology: